feat(mcp): cap query results at 1000 rows#30
Conversation
The MCP execution tools returned every row, so a broad SELECT could flood an agent's context (and tokens). Cap reads at MAX_RESULT_ROWS (1000): the result is fetched, then capped, and the response carries `truncated` plus the true `rowCount` so the agent knows to narrow (LIMIT/WHERE) — it has SQL, unlike a fixed API. `query` also takes an optional `maxRows` to request fewer (clamped to the hard cap). Scope is MCP-only by decision: the web UI route stays uncapped because it needs the full result set for CSV export and inline editing, and a human driving the grid (with virtualization) doesn't have the agent's context-blowup problem. Pure helpers (capRows, readMaxRows) are unit-tested; docs updated. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
|
The latest updates on your projects. Learn more about Vercel for GitHub.
|
There was a problem hiding this comment.
Pull request overview
Introduces a hard row cap for MCP SQL execution tools to prevent oversized result payloads from flooding the agent context window, while keeping the web UI query path uncapped for CSV export and inline editing.
Changes:
- Add a hard cap (
MAX_RESULT_ROWS = 1000) withtruncatedsignaling and additional metadata (rowCount,returnedRows,note) on MCPqueryresults. - Add
maxRowsoption (clamped to the hard cap) and pure helperscapRows/readMaxRows. - Add unit tests for the new helpers and update MCP server documentation to describe the cap behavior.
Reviewed changes
Copilot reviewed 3 out of 3 changed files in this pull request and generated 1 comment.
| File | Description |
|---|---|
| tests/mcp.test.ts | Adds unit tests covering capRows and readMaxRows behavior and validation. |
| server/mcp.ts | Implements MCP-side row capping, exposes maxRows, and enriches tool responses with truncation metadata. |
| docs/features/mcp-server.mdx | Documents the MCP-only cap and guidance for narrowing queries. |
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
| const result = await runAndAudit(principal, tool, connection, details, finalSql, rawSql) | ||
| // rowCount is the true total from the server (CommandComplete), independent of capping. | ||
| const rowCount = result.count ?? result.rows.length | ||
| const { rows, truncated } = capRows(result.rows, readMaxRows(args)) | ||
| if (expectedPerm === 'read') { |
There was a problem hiding this comment.
Fixed in 6f554dc. execute() now honors maxRows only when expectedPerm === 'read'; write_data/run_ddl always use the hard cap, so a maxRows smuggled past the (non-enforced) schema no longer changes their behavior.
|
Preview deployment for your docs. Learn more about Mintlify Previews.
💡 Tip: Enable Workflows to automatically generate PRs for you. |
| returnedRows: rows.length, | ||
| truncated, | ||
| ...(truncated | ||
| ? { note: `Showing the first ${rows.length} of ${rowCount} rows. Add LIMIT/WHERE, or set maxRows (≤${MAX_RESULT_ROWS}), to narrow.` } |
There was a problem hiding this comment.
When an agent explicitly passes a smaller
maxRows (e.g. maxRows: 50) and still gets truncated, the note says "set maxRows (≤1000), to narrow" — which reads as an invitation to raise the value toward 1000 rather than a prompt to reduce it further or use SQL predicates. An agent following this hint literally could request more rows than intended. Reflecting the effective cap in the note makes the guidance unambiguous.
| ? { note: `Showing the first ${rows.length} of ${rowCount} rows. Add LIMIT/WHERE, or set maxRows (≤${MAX_RESULT_ROWS}), to narrow.` } | |
| ? { note: `Showing the first ${rows.length} of ${rowCount} rows. Add LIMIT/WHERE, or pass a smaller \`maxRows\` (minimum 1), to narrow.` } |
Note: If this suggestion doesn't match your team's coding style, reply to this and let me know. I'll remember it for next time!
There was a problem hiding this comment.
Reworded in 6f554dc, taking the spirit rather than the literal suggestion: the note now reads "Refine with WHERE/ORDER BY/LIMIT to target the rows you need." I dropped the maxRows mention from the narrowing hint entirely — lowering maxRows just shows fewer rows, not the right ones, so steering toward SQL predicates is the unambiguous guidance. maxRows stays documented in the tool schema for callers who simply want fewer.
| // rowCount is the true total from the server (CommandComplete), independent of capping. | ||
| const rowCount = result.count ?? result.rows.length | ||
| const { rows, truncated } = capRows(result.rows, readMaxRows(args)) |
There was a problem hiding this comment.
rowCount fallback relies on full in-memory materialization — the comment says rowCount is "from CommandComplete, independent of capping," but the fallback result.rows.length is the full in-memory result, not the CommandComplete tag directly. This is functionally correct today only because all rows are materialized before capping (as the PR itself notes in follow-ups). If the cursor optimization is later added without updating this line, result.rows.length could become the cursor batch size instead of the true total, making rowCount silently wrong when result.count is not set. A brief code comment here flagging this dependency would help future maintainers.
There was a problem hiding this comment.
Addressed in 6f554dc with a clarifying comment. You are right it is correct only under full materialization — the comment now explicitly flags that the result.rows.length fallback is the true total only because the result is materialized before capping, and that a future cursor-based fetch must revisit it.
…comment Address PR review: - maxRows is a query-only knob, but execute() applied readMaxRows() for all execution tools; since MCP tool schemas aren't enforced at runtime, a maxRows smuggled into write_data/run_ddl was honored. Honor it only for reads (`expectedPerm === 'read'`); other tools always get the hard cap. - Reworded the truncation note: it said "set maxRows (≤1000)", which read as an invitation to raise the value. Now points to WHERE/ORDER BY/LIMIT to target the needed rows. - Tightened the rowCount comment to flag that the result.rows.length fallback is the true total only under full materialization, so a future cursor change must revisit it. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Why
The MCP execution tools returned every row. A broad
SELECT * FROM eventswould serialize the whole result into one JSON payload and ship it to the agent — blowing its context window and burning tokens. (Diagnosed while reviewing how desktop SQL clients handle large results: DBeaver caps at 200, DataGrip at 500, etc.)What
Cap the MCP read path at
MAX_RESULT_ROWS = 1000:truncated: trueand the truerowCount(fromCommandComplete, independent of capping), plusreturnedRowsand anotetelling the agent to narrow withLIMIT/WHERE.querygains an optionalmaxRowsarg to request fewer (clamped to the hard cap).write_data/run_ddlalso cap their RETURNING rows, surfacingtruncatedonly when it actually bit.The agent has SQL, so — unlike a fixed-API tool that needs server-side
jqfiltering — the right lever is a safety cap plus a clear signal to refine the query.Scope: MCP only (by decision)
The web UI route stays uncapped. It needs the full result set for CSV export (
QueryResults.tsxexports client-side rows) and inline editing, and a human driving a virtualized grid doesn't have the agent's context-blowup problem. Capping the UI would silently truncate exports — so it's deliberately out of scope. The cap constant lives inmcp.tsand isn't imposed on the RPC path.Tests
capRows/readMaxRows(pass-through, exact-cap boundary, truncation, default, clamp-down, reject non-positive/non-integer).tests/mcp.test.ts(39) +tests/execute-sql.test.ts(9) = 48 pass;tsc --noEmitclean;pnpm build:serversucceeds.docs/features/mcp-server.mdx) updated.Follow-ups (noted, not in this PR)
postgres.jscursor (.cursor(cap+1)) so pathologicalSELECT *doesn't materialize fully server-side first — deferred because it rewires the result path and needs live-DB validation.jsonb/byteavalues within the row cap.🤖 Generated with Claude Code